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We investigate the evolution of Boolean networks subject to a selective pressure which favors 
robustness against noise, as a model of evolved genetic regulatory systems. By mapping the evo- 
lutionary process into a statistical ensemble and minimizing its associated free energy, we find the 
structural properties which emerge as the selective pressure is increased and identify a phase tran- 
sition from a random topology to a "segregated core" structure, where a smaller and more densely 
connected subset of the nodes is responsible for most of the regulation in the network. This seg- 
regated structure is very similar qualitatively to what is found in gene regulatory networks, where 
only a much smaller subset of genes — those responsible for transcription factors — is responsible 
for global regulation. We obtain the full phase diagram of the evolutionary process as a function of 
selective pressure and the average number of inputs per node. We compare the theoretical predic- 
tions with Monte Carlo simulations of evolved networks and with empirical data for Saccharomyces 
cerevisiae and Escherichia coli. 



I. INTRODUCTION 

Many large-scale dynamical systems are composed of 
elementary units which are noisy, i.e., can behave non- 
deterministically, but nevertheless must behave globally 
with some degree of predictability. A paradigmatic ex- 
ample is gene regulation in the cell, which is a system of 
many interacting agents — genes, mRNA, and proteins 
— which are subject to stochastic fluctuations. What 
makes gene regulation particularly interesting is that it 
is assumed to be under evolutionary pressure to preserve 
its dynamic memory against stochastic fluctuations p]- 
3J. Many important cellular processes require such re- 
liability, such as circadian oscillations |2J. Furthermore, 
in multicellular organisms, errors in signal transduction 
can potentially lead to catastrophic consequences, such 
as embryo defects or cancer [TJ [4]. Since the source of 
noise cannot be fully removed [5], a gene regulation sys- 
tem must adopt characteristics which compensate for the 
unavoidably noisy nature of its elements. Since they are 
a product of natural selection, these characteristics must 
emerge from random mutations and subsequent selection 
based on fitness. A central question concerns the gen- 
eral large-scale features which are likely to emerge in this 
scenario that result in reliable function under noise. In 
this work, we study the emergence of robustness against 
noise in networks of Boolean elements which are subject 
to selective pressure, functioning as a model for evolved 
gene regulatory systems. We show that the system un- 
dergoes a structural phase transition at a critical value 
of selective pressure, from a totally random topology to 
a "segregated-core" structure, where a smaller and more 
densely connected subset of the network is responsible 
for the regulation of most nodes in the network. This 
characteristic is present to a significant degree in gene 
regulatory systems of organisms such as yeast and Es- 
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cherichia coli, in which all the regulation is done by a 
much smaller (and denser) subset of the network, com- 
prised of transcription factor genes. 

Boolean networks (BNs) have been used extensively 
to model gene regulation [6-9J. The Boolean value on a 
given node represents the level of concentration of pro- 
teins encoded by a gene, which in the simplest approx- 
imation can be either "on" or "off." The regulation of 
genes by other genes is represented by Boolean functions 
associated with each node, which depend on the state 
of other nodes called the inputs of the function. The 
dynamics on these networks serve as a model for the mu- 
tual regulation of genes which control the metabolism of 
cells in an organism. Gene regulation is composed of spe- 
cific steps involving the production of proteins and other 
metabolites, which need to be carried out in specific se- 
quences and under certain conditions. During each of 
these steps the dynamics is subject to stochastic fluctu- 
ations [2j |3j [10], since the number of proteins involved 
can be very low [2j [11], and the whole process lacks an 
inherent synchronization mechanism. In order for the 
regulation process to work reliably, the network must 
possess some degree of robustness against these pertur- 
bations [12J. Indeed, the investigation of real regulatory 
networks modelled as BNs, such as the one responsi- 
ble for the yeast cell cycle [13], revealed a remarkable 
degree of robustness, where most trajectories in state 
space lead to the same attractor, regardless of the ini- 
tial conditions. Similar results were also obtained for 
the segment polarity regulatory network in Drosophila 
melanogaster [T^ [15], which showed that wild- type at- 
tractors are significantly robust to different initial con- 
ditions and perturbations, and seem to depend only on 
general topological characteristics of the network, instead 
of specific functional details. However, the general fea- 
tures which make BNs robust against different types of 
perturbations are still being identified. 

Perhaps the simplest form of perturbation one can con- 
sider is a "flip" of a single node in the network, and the 
propagation of flips which result from it. This corre- 
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sponds to the situation where the stochastic noise is very 
weak, and can be modeled as a single flip event. After 
the perturbation, the system has an arbitrary amount 
of time to recover (if it recovers), and different pertur- 
bations do not build up. Many authors have considered 
the robustness against perturbations of this type, includ- 
ing Kauffmann [6j |7] who was the first to propose ran- 
dom Boolean networks (RBNs) — networks with fully 
random topology and functions — as a model of gene 
regulation. According to this type of perturbation, the 
dynamics of RBNs [9 J can belong to one of two phases, 
depending on the number of inputs per node K: A frozen 
phase (K < 2) where the perturbation propagates sub- 
linearly in time and eventually dies out; and a "chaotic" 
phase (K > 2) where the perturbation grows exponen- 
tially and eventually reaches the entire system. A crit- 
ical line exists at K = 2, where the perturbation grows 
algebraically, and features from both phases are simulta- 
neously observed. 

Although RBNs in the frozen phase and on the critical 
line show features which can be interpreted as robustness 
in some sense, they fall short of being convincing models 
for gene regulation. Actual gene regulation networks are 
not random and show a high degree of topological (TB] 
and functional [17J organization which are not present 
in simple RBNs. Conceivably, these features arise out 
of stringent requirements to perform specific tasks and 
of types of robustness which are more demanding than 
the containment of single flip perturbations. As an at- 
tempt at producing more complete models, many authors 
investigated the evolution of BN systems, where the fit- 
ness criterion is some form of robustness against pertur- 
bation which is not inherent to RBNs. The majority of 
authors assumed single flips as the only type of noise, 
but considered different types of responses as fitness cri- 
teria [T8H25] , most of which are related to the capacity of 
the network to display the same dynamical pattern after 
a single-flip. In particular, in [23J it was found that if the 
fitness criterion is the ability to return to the same attrac- 
tor after the perturbation, the evolved networks always 
achieve maximum fitness. Furthermore these networks 
with maximum fitness span a huge portion of configura- 
tion space, and show a high degree of variability. This 
means not only that this type of robustness can evolve, 
but also that it is not a very demanding task for the 
evolutionary process. 

In this work, we consider the arguably more realistic 
situation where the perturbations are caused by tran- 
scriptional noise, which can be arbitrarily frequent [26J. 
In this scenario, the effects of noise can overlap and build 
up in time. The appropriate fitness criterion remains 
whether or not the network is capable of performing some 
predefined dynamical pattern, but this is a task which be- 
comes much more complicated. In fact, it can be shown 
that for networks which are sparse, i.e., the average num- 
ber of inputs per node is some finite number, perfect 
robustness can never be achieved, and some amount of 
deviations, or "errors," in the dynamics are always going 



to exist [571 121] • Instead, one measures robustness not 
only by the amount of existing errors, but also by the 
ability of the system to not be overtaken by them and 
consequently lose all memory of its dynamical past — 
i.e., to become ergodic. This type of robustness is much 
stronger than, and not necessarily related to the ability of 
the system to contain single-flip perturbations. This was 
shown in [29J for RBNs subject to transcriptional noise, 
for which neither phase (chaotic or frozen) is robust, and 
both display ergodic behavior, for any non-zero value of 



Furthermore, unlike pti2T| f23| [23]. in this work we 
also consider the cost which is associated with different 
levels of robustness. It is generally the case that improved 
robustness can be obtained by introducing redundancy 
or some other mechanism that counteracts the effect of 
noise, which increases the overhead in the system. This 
added overhead can impact negatively on the fitness of 
the organism, which needs to spend more energy or more 
time to perform the same task. Therefore the trade-off 
between overhead and robustness is also driven by the 
evolutionary process. In this work, this overhead is con- 
trolled by fixing the average in-degree during the evo- 
lutionary process, which becomes an external parameter. 
By selecting the appropriate value, one automatically de- 
termines a selective pressure that yields the correspond- 
ing trade-off. 

Our main result is that under transcriptional noise, 
the selective pressure can have a very noticeable effect 
on large-scale properties of the system: If it is large 
enough, it triggers a structural phase transition, where 
networks change from a random topology to a segregated- 
core structure, with most nodes being regulated by a 
smaller and denser subset of the network. This ob- 
served segregated-core topology is strikingly similar (even 
if qualitatively so) to what is observed in most real gene 
regulation networks; namely, genes are separated into 
two classes: target genes, and those which regulate tran- 
scription factors. Only transcription factor genes are re- 
sponsible for regulation of other genes, and they are usu- 
ally orders of magnitude smaller in number than target 
genes [30] , 

This work is divided as follows. We begin in Sec. [TT|by 



presenting the model, and in Sec. [Hi] we define the evolu- 
tionary process and map it into an equivalent Gibbs en- 
semble. We then parametrize the topological character- 
istics of the system as a stochastic blockmodel in Sec. |IV[ 
and obtain an expression for its entropy. In Sec. [V] we de- 
scribe the technique used to minimize the free energy. We 
follow in Sec. I VII with the characterization of the exist- 
ing phase transition and obtain the phase diagram. We 
perform comparisons with Monte Carlo simulations in 
Sec. [VII] and with the gene regulatory networks of yeast 
and E. coli in Sec. |VIII| Finally, we conclude with a dis- 
cussion. 
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II. THE MODEL 

A Boolean network [6j[9] is a directed graph of N nodes 
representing Boolean variables a G {1,0}^, which are 
subject to a deterministic update rule, 

a i (t+l) = f i (a(t)) (1) 

where fa is the update function assigned to node z, which 
depends exclusively on the states of its inputs. At a given 
time step all nodes are updated in parallel. 

We include noise in the model by introducing the prob- 
ability P that at each time step a given input has its 
value "flipped": <j 3 ; — 1 — Oj, before the output is com- 
puted [29]. This probability is independent for all inputs 
in the network, and many values may be flipped simulta- 
neously. The functions on all nodes are taken to be the 
majority function, defined as 

/,(K})=( Uf ^^ > " i/2 ' (2) 
I otherwise, 

where ki is the number of inputs of node i. It is assumed 
throughout the paper that the values of ki are always 
odd [31]. This is so chosen because odd- valued majority 
functions are optimal, since no other function performs 
better against noise [32] . By using Eq. [2j we essentially 
remove the choice of functions from the evolutionary pro- 
cess and concentrate solely on topological aspects. 

Starting from an initial configuration, the dynamics of 
the system evolves and eventually reaches a dynamically 
stable regime, where the average fraction b t of nodes with 
value 1 no longer changes, except for stochastic fluctua- 
tions which vanish for a large system size [28| l33j. In 
the absence of noise (P = 0) there are only two possible 
attractors (if the network is sufficiently well connected) 
where all nodes have the same value, which can be either 
or 1. We will consider these homogeneous attractors 
as representing the "correct" dynamics, and denote the 
deviations from them as "errors." More specifically, and 
without loss of generality, we will name the value of 1 as 
an "error," and the value of b t as the average error on the 
system. 

The steady- state fraction of errors 6* = lim^oo b t (for 
bo = 0) will increase with P. For any network with a 
finite average in-degree there will be a critical value of 
noise P* for which the dynamics undergoes a phase tran- 
sition, and the value of 6* reaches 1/2, and remains at 
this value for P > P* [27J. The value 6* = 1/2 is special, 
since it means that the dynamics lost the memory of its 
initial state, since any other initial value of bo (including 
bo > 1/2) would lead eventually to this same value of 6*. 
Therefore, the value of P* marks the transition from a 
nonergodic to an ergodic dynamics. Robustness against 
noise is synonymous with nonergodicity, since only in 
this regime are dynamical correlations not destroyed over 
time. 

BNs with majority functions serve as a paradigmatic 
model for networks robust against noise, since they are 



composed of optimal elements, and they show a minimal 
dynamical behavior in the absence of noise, namely two 
homogeneous attractors with {c^} = or 1. If robust- 
ness cannot be attained for such a system, it is much 
less likely to be possible for a different system with a an- 
other choice of Boolean functions, or displaying a more 
elaborate dynamical pattern [28] . 

In this work we will consider the value of the steady- 
state average error 6* as the main fitness criterion gov- 
erning the survival probability of an organism, since it di- 
rectly measures the deviation from the situation without 
noise. Although the phenotype itself, i.e. the dynamics 
without noise, does not change during the evolutionary 
process considered here, its stability, as measured by 6*, 
does. This translates into an actual fitness criterion, since 
it is not enough for phenotypes to exist; they must also 
be stable against perturbations. If they are not, they are 
not viable in practice, and thus should not be observed. 

III. EVOLUTIONARY DYNAMICS 

We suppose that a given BN represents the genotype 
of a full organism, which can self-replicate and belongs 
to a population that is subject to an evolutionary pres- 
sure. The number of individuals in the population is 
assumed to be sufficiently large and constant. Individ- 
uals replicate a given number of times with a constant 
rate. Parents die the moment they replicate. The off- 
spring are always initially identical to their parents, but 
are individually subject to point mutations represented 
by the matrix /i^ , which defines the probability of mu- 
tating from genotype (i.e. network) i to j. The offspring 
survive with probability a^, given by the Boltzmann se- 
lection criterion 

a^oce^S (3) 

where fa is the fitness of genotype i. The parameter 
j3 controls the selective pressure: For large values of /3 
only the very best networks survive, whereas for smaller 
values most networks do. As mentioned previously, the 
fitness of a network will be given by the fraction of ones 
("errors") after a sufficiently long time, b*^ = lim^oo b[ l \ 

for b^ = 0, as 

U = -Nb* {i) . (4) 

Thus, the largest fitness a network can have is fa = 0, 
which should be possible only if there is no noise (P = 0) . 

We suppose that the global offspring mortality rate 
is such that the size of the population always remains 
constant. If we consider that the dynamics occurs in 
discrete time steps, we can write the probability 7Ti(t) of 
finding an individual in the population with genotype i 
at time t as a Markov chain, 

7Ti(t) = aj^TTjit - VjUji. (5) 
3 
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The mutation probabilities fiji have a decisive effect on 
what topologies emerge. Mutations in actual biological 
systems may result in topological bias, such as gene dupli- 
cations, which are not reversible and result in networks 
with broad degree distributions [34, 35J. However, the 
central aim here it to obtain the most likely topology that 
arises due to the selective pressure alone. For this reason 
we are more interested in mutations which will lead to 
all possible networks with equal probability in the ab- 
sence of selective pressure (i.e. ergodicity). A simple and 
conventional choice is reversible mutations, /i^- = /i^, 
for which the steady state 7^ = lim^oo 7^ (t) obeys the 
detailed balance condition: Hiii^a^ — TTjfijidi. This is 
a sufficient condition for the desired ergodicity property, 
but it is not strictly necessary, since other types of mu- 
tations may also fulfill it. However, from this condition 
we easily obtain that the steady-state probability of find- 
ing an individual with genotype i is given by its survival 
probability, 



(6) 



where Z = 



oPfi 



. This corresponds ex- 



actly to a Gibbs ensemble of all possible genotypes, with a 
partition function given by Z, where Nb*^ plays the role 
of the "microstate energy" and f3 is the "inverse temper- 
ature" (these are of course only mathematical analogies, 
since these quantities do not actually represent a physi- 
cal energy and temperature, respectively). The average 
intensive "energy" in the ensemble is thus 



■PNbfo 



(7) 



and the canonical entropy is 

S = -^Tr^lnTr; = lnZ + f3Nb*. (8) 

The objective is to obtain not only b* for a given /3, but 
also the network topologies which characterize the en- 
semble. Instead of considering all microstates individ- 
ually (i.e., all possible networks) and computing Eqs. 7] 
and [8] directly, we may parametrize the whole ensemble 
via some macroscopic variables {xj} which sufficiently 
describe its topological properties. These variables must 
be chosen so that it is possible to write both b*({xj}) and 
S({xj}) as functions of these variables alone. The en- 
tropy can, for instance, be obtained via the microcanon- 
ical formulation 



S({x j })=\nn({x j }), 



(9) 



where Q({xj}) is the number of different networks given 
a macroscopic parametrization {xj}. The values of {xj} 
which correspond to thermodynamic equilibrium [i.e., the 
steady state of Eq. [5] can be obtained by minimizing the 

"free energy", 



r{{x s })=Nb*({x s })-S{{x s })/(3, 



(10) 



with respect to {xj}. This stems from the principles of 
maximum entropy and minimum energy, for closed sys- 
tems with fixed energy and entropy, respectively, which 
need to hold in thermodynamic equilibrium [36]. It 
should again be emphasized that the theory so far is only 
a mathematical tool, which, although exact, says nothing 
about the actual physical thermodynamical properties of 
the evolved systems, i.e., they have no relation to an 
actual measurable energy or temperature. Instead, the 
minimization of Eq. [l0]is entirely analogous to obtaining 
the steady state of Eq. [5] by any other means. However, 
this approach, together with an appropriate topological 
parametrization, allows us to obtain the outcome of the 
evolutionary process on the population, without having 
to actually implement any dynamics. As will be de- 
scribed in detail in the next section, we will parametrize 
the ensemble as a general stochastic blockmodel, which al- 
lows for a wide range of topological configurations, while 
at the same time allowing for a tractable computation of 
6* and 5, which then can be used to minimize T . 

It should also be mentioned at this point that we are 
interested in the properties of typical networks in the en- 
semble when the selective pressure f3 is varied, under the 
restriction that the average number of inputs per node 
(the average in-degree) (k) is always the same. As men- 
tioned in the introduction, this restriction originates from 
the assumption that a larger number of inputs increases 
the putative cost for the organism of realizing a regula- 
tory mechanism which depends on more elements. Thus, 
the value of (k) should on its own impact the fitness of 
the organism, and should also be subject to natural se- 
lection. For simplicity, we do not describe the fitness 
landscape which depends on (k) and its evolution, in or- 
der to emphasize the effects of robustness against noise 
alone. Instead, we consider (k) as an external parame- 
ter, which essentially means that the fitness pressure on 
(k) supersedes that of the other parameters, such that 
it cannot change during evolution. In this way, we are 
implicitly considering the cost associated with the robust- 
ness achieved by increasing (k): the smaller is the value 
of (k) chosen, the larger is the implied fitness penalty of 
having more connections. 



IV. STOCHASTIC BLOCKMODEL 

Simultaneous consideration of all possible networks 
with a given (k) is a tremendous task, due to the gi- 
gantic number of diverse configurations which are pos- 
sible. For arbitrary networks the computation of 6* ac- 
cording to Eq. [T] may be very cumbersome, since it may 
depend on many degrees of freedom. Therefore, we nar- 
row down the allowed subset of possible network topolo- 
gies to those which can be accommodated in a stochastic 
blockmodel [37-39J [40 . As will become clear in the fol- 
lowing, we do so without sacrificing the generality of the 
approach, since we can progressively add to this model as 
many degrees of freedom as we desire, and in this way ob- 
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FIG. 1. (Color online) Example of a network corresponding 
to a blockmodel with five blocks. The vertices of each block 
are labeled with the same color. 



tain arbitrarily elaborate structures in a controlled fash- 
ion. 

A stochastic blockmodel assumes that the nodes in the 
network can be partitioned into discrete blocks, such that 
every node belonging to the same block has (on average) 
the same characteristics. Hence, for very large systems, 
we need only to describe the degrees of freedom associ- 
ated with the individual blocks (see Fig.[T]). By consider- 
ing a system composed of many blocks, we can describe 
a wide array of possible topological configurations. 

More precisely, a (degree-corrected |39j) stochastic 
blockmodel is a system of n blocks, where W{ is the frac- 
tion of nodes in the network which belong to block i (we 
have therefore that ^ wi = 1), and p\ is the in-degree 
distribution of block i. Thus, the average in-degree is 
given by (k) = ^2 k i kp l k Wi. The matrix describes 
the fraction of the inputs of block i which belong to block 
j (we have therefore that Y^j w j^i = !)• Since the out- 
degrees are not explicitly required to describe the dy- 
namics (see Eq. [TT] below), they will be assumed to be 
randomly distributed, subject only to the restrictions im- 
posed by Wi and Wj^n. 

We note that, although we have diminished the class 
of networks which will be accessible by the evolutionary 
algorithm, we still allow a very large array of possible con- 
figurations, which can in principle incorporate arbitrary 
in-degree distributions, degree correlations [41J, assorta- 
tive or disassortative mixing [42], and community struc- 
ture [43J, to name only a few properties. As will become 
clear in the following section, this blockmodel is sufficient 
to characterize the most important topological property 
that is relevant for robustness against noise, which is the 
formation of densely connected central subgraphs. 



A. The value of b* for a blockmodel 

Supposing that the number of vertices Nwi belonging 
to each block i is arbitrarily large, we can compute the 
value of b* using an heterogeneous version of the annealed 
approximation [44J, by supposing that at each time step 
the inputs of each function are randomly chosen, such 
that the specified block structure given by Wi->j is al- 



ways preserved [55]. If the number of vertices in each 
block is large enough, we can expect this approximation 
to become an exact description for quenched networks as 
well. We can then write the average value of bi for each 
block over time as 



&i(t+l) 



k 



plm k (1 - 2P) 



j 



Mt) + p 



(ii) 

which is a system of n coupled maps, where m k (b) is the 
probability that the output of a majority function will be 
1, if the inputs are 1 with probability 6, and is given by 



m k (b) 



k 

£ 

= \k/2] 



b i (l-b) k 



(12) 



A fixed point of Eq. [TT] represents the solution of a poly- 
nomial system of arbitrary order, and therefore cannot 
be written in closed form. However, it can be obtained 
numerically by starting the system at bi = 0, and iterat- 
until a fixed-point {6*} is reached. The value 
can then be obtained as 6* = Wib*. 



ing Eq. 
of b 



B. Blockmodel entropy 

We obtain the entropy of the stochastic blockmodel 
ensemble [45, 46J by enumerating all possible networks 
which are compatible with a given choice of w^, p k and 



w 



. To make the counting simpler, we ignore the dif- 
ficulty of forbidding parallel edges, and consider the en- 
semble of configurations, since the occurrence of parallel 
edges should vanish for large network sizes (see [46J for 
more details). Later we compare the results obtained 
with Monte-Carlo simulations with parallel edges forbid- 
den, and we find very good agreement. 

We begin by enumerating all possible in-degree se- 
quences of each block which correspond to the prescribed 
in-degree distributions, 



(Nwi)\ 



n k {N WiV {)v 



(13) 



For a given block i with a fixed in-degree sequence, we 
can count the number of different input choices as 



where Ei = Nwiki is the total number of inputs belong- 
ing to block i and Ej^n = Wj^Ei is the total number 
of inputs from block i which belong to block j. Since 
the set of inputs of each function is unordered, we still 
need to divide the whole number of input combinations 
by Yl k (k\) Nk , where N k = A 7 '^ i k Wip\ is the total num- 
ber of vertices with in-degree k. Putting it all together 
we have 



Q = tt 



U k (k\) Nk ' 



(15) 
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Taking the logarithm of this expression, and the limit 
N ^> 1, and using Stirling's approximation, we obtain 
the full entropy (up to a trivial constant term, which is 
not relevant to the minimization of the free energy), 



S/N=(k) lnN + J2 w i S i 

i 

— Wiki Wj^i In 



Wj 



, (16) 



where fif is an entropy term associated with the degree 
distribution of block z, and is given by 



5 " = -E^( ln ^+ lnfc! )- 



(17) 



C. Choice of single-block in-degree distribution 

We want to constrain the number of degrees of freedom 
in the model, such that only the average in-degree ki of 
each block is specified, not the entire distribution. In this 
way, graphs with many different global in-degree distribu- 
tions are still possible by composing different blocks with 
different fc^'s, but we have a finite number of degrees of 
freedom per block. In order to obtain the in-degree distri- 
bution of the individual blocks, we maximize the entropy 
<S, with the restriction that the average in-degrees are 
fixed. For that, we construct the Lagrangian, 



A 



E 



K ( E k Pk 




(18) 

where {A^} and {/^} are Lagrange multipliers which keep 
the averages and the normalizations constant. We note 
that the sum over k is made only over odd values of fc, 
due to the imposed restrictions on the majority function. 
Obtaining the critical point ({Jg-}, {f^}, {§±}) = 0, 



and solving for {p l k } one obtains, 



Pk 



1 A? 
sinhA ? - kl ' 



(19) 



where ki = A*/ tanh A^ . Equation [l9] is a Poisson dis- 
tribution, which is defined and normalized only for odd 
values of k. 

This choice of p\ is not necessarily the optimal one. 
In fact, it is possible to show that single- valued distri- 
butions with zero variance tend to provide the best er- 
ror resilience [28J. Nevertheless, the improvement over a 
Poisson distribution is very small, and the definition of 



Eq. [19] allows for the average ki to be continuously var- 
ied, which is very practical for the optimization of the 
free energy. 



D. Block splitting, decrease of entropy and the 
necessary number of blocks 

For the blockmodel defined in this section, there are 
2n + n 2 variables which define the topology, where n is 
the number of blocks. In order for arbitrary topologies to 
be faithfully represented by the model, one would need 
to make n — >• oo, which would render this approach im- 
practical. However, we will show that for the purpose at 
hand, only a minimal number of two blocks is sufficient 
to fully characterize the evolutionary process, without re- 
lying on any approximations. This is due the following 
two facts: 1. Any possible value of 6* can be obtained 
with only two blocks; 2. Any other topology with the 
same 6* will invariably have a lower entropy, and thus a 
larger free energy. Thus the minimum of the free energy 
will always lie on a two-block structure. 

The first fact can be shown by construction: Consider 
a system of two blocks, where one of them (the "core") is 
smaller and much denser, and the remaining block has an 
average in-degree close to the global average. The inputs 
of the core block belong mostly to the core itself, as well 
as the inputs of the remaining block. By changing the 
density of the core block, as well as the degree of input 
segregation, it can be shown [28J that any possible value 
of b* can be achieved [47] . 

The second fact can be shown by considering a sys- 
tem of many blocks, and selecting any two blocks, / and 
m. If all other blocks are kept intact, it can be shown 
that the entropy will always be larger if these two blocks 
are merged into an effective single block. This can be 
shown by partially maximizing the entropy S via the La- 
grangian, 



A = S 




(20) 

where \i and {7^} are Lagrange multipliers which 
keep the average in-degree and the normalization of 
Wj^i fixed, respectively. Obtaining the critical point 
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dA 



dw 
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}) = 0, and solving 



for wi\ m , ki\ m , {wj^i\ m }, {wi\ m ^j} one obtains, 
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(21) 
(22) 

(23) 



This corresponds to the situation where the nodes from 
blocks I and m are topologically indistinguishable, i.e. 
the outgoing and incoming edges are randomly dis- 
tributed among the nodes of both blocks. Since any 
arbitrary many-block structure can be converted into a 
single-block by successive block merges, it follows di- 
rectly that any arbitrary many-block structure can be 
constructed by starting from a single block, and succes- 
sively splitting blocks. Thus, since the merging of blocks 
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always increases entropy, and the splitting decreases it, 
the entropy of the final structure must be smaller than 
that of both the initial single block and the succeeding 
two-block network. 

In order for a many-block structure to have a lower free 
energy than the two-block structure with the same value 
of 6*, it needs to have a larger entropy. But according 
to the above argument, networks with a larger number 
of blocks tend to have smaller entropy. Networks with 
larger entropy and number of blocks would have to be 
more randomized than the two-block structure, which 
would invariably result in a larger value of 6*. We can 
therefore conclude that the global minimum of the free 
energy always occurs for a two-block structure, and thus 
we need to deal with only eight variables [48J. 



V. MINIMIZATION OF THE FREE ENERGY 

Although we have an analytical expression for the en- 
tropy <S, the value of 6* cannot be obtained in closed form, 
and thus the same is true for T . Therefore we must re- 
sort to a numerical computation of 6*, via the iteration 
of Eq. [TT] and minimize T with a gradient descent al- 
gorithm, using finite differences. Many of these methods 
work only for unconstrained optimization problems, and 
we need to impose several constraints: The average in- 
degree must be fixed, and the Wi and Wj^i distributions 
must be normalized. However we can make the problem 
unconstrained by using the following transformations, 



Wi = 



Wj^i = 



E 3 e** 
Ei e*'- 



(24) 



(25) 



where Wi and w 3 



are unconstrained real variables. 
Likewise we can transform A 7 - as 



ki — 



tanh e Xi 
ki = cki+j 
Xi = ki tanh Xi 



(26) 

(27) 
(28) 



where A^ is also an unconstrained real variable. To obtain 
Xi from Eq. [28] it is simply iterated until it converges to 
the correct value, within some desired precision. The 
values of c and 7 are chosen to force ki > 1 and the 
average in-degree to a prescribed value (k), 



(k) 



c = 



Ei k i w i 
(k)~ 



Ei k i w i 



7 = 0, 



7 = 1 — ck n 



if k m > 1 



otherwise, 



(29) 



where k m = min({/^}). Thus we have obtained an un- 
constrained minimization problem of function T with re- 
spect to the variables {wi}, {wj^i}, {Xi}. 



In order to find the minimum of Eq. [TUJ we employed 
the L-BFGS quasi-Newton algorithm [49J, with the gra- 
dient computed by finite differences. 



VI. STRUCTURAL PHASE TRANSITION 

The minimization of the free energy leads to one of 
two possible structures (see Fig. [2J: 1. For low values 
of j3 the topology is a fully random graph; 2. For larger 
values of j3 there is the emergence of a segregated core 
structure, where one of the blocks has a larger in-degree 
density and is more responsible for the regulation of the 
whole network. 

In order to precisely characterize the phase transition, 
we define the following order parameter, 



b r 



(30) 



where b r is the value of 6* for a fully random network, 
and b m { n is the smallest possible value of 6* for a given 
(k) [28J, given by 



k 



Pkm k (P), 



(31) 



where p k is given by Eq. 19 
therefore that <\> G [0, 1] and if 



with ki = (k). We have 
6 = the network ensem- 



ble must be fully random, and if <p = 1 it has the largest 
possible value of fitness. 

As can be seen in Fig. |3J there is a second-order phase 
transition at a critical value /?*, which depends on the 
noise level P. The dependence of (3* on P divides the 
/3 x P phase diagram into an upper and lower branch, 
as can be seen in Fig. [4] The branches are divided at a 
value of P = P*, which is the critical value of noise of a 
fully random network (see [28J for an exact calculation of 
P*). At this value of noise, a random network undergoes 
a dynamic phase transition, where the steady state error 
fraction reaches the maximum level 6* = 1/2, and the 
dynamics become ergodic, as was described previously. 
For P < P*, random networks are intrinsically robust, 
since 6* < 1/2, and the critical value f3* becomes larger 
with smaller P. In other words, the smaller is the value 
of noise P, the better is the behavior of fully random 
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FIG. 2. (Color online) Structural phase transition observed 
when varying the selective pressure /3, as described in the text. 
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FIG. 3. (Color online) The order parameter as a function 
of the selective pressure /?, for different noise levels P, and for 
(k) = 5. The left panel shows curves for P < P* , where P* is 
the critical value of noise for a fully random network, and the 
right panel shows curves for P > P*. The curves on the left 
panel are shown in order of increasing P from right to left, 
and on the right panel, from left to right. 



networks, such that the entropic cost of providing fur- 
ther improvement by creating a segregated core becomes 
larger, which therefore occurs only at larger values of 
selective pressure. The situation changes for P > P*. 
Since random networks are no longer resilient, and have 
collapsed onto b* = 1/2 (see Fig. [5|, a segregated core 
provides a significant improvement, for a relatively low 
entropic cost. This cost increases with P, since the core 
needs to be either denser, smaller or more isolated to pro- 
vide the same benefit under larger noise. Thus the value 
of f3* also increases with P. Interestingly, in the vicin- 
ity of P = P*, the value of (3* tends to zero. For this 
value of noise, the dynamics of the fully random topology 
lies exactly at the critical point where 6* = 1/2, and even 
the smallest (structural) perturbation can move the fixed 
point appreciably. Since such small structural perturba- 
tions have negligible entropic cost, the value of /3* van- 
ishes to zero. Thus, networks with (k) such that P * = P 
are the most easily evolvable. 

The value of 6* itself can be seen in Fig. [5] The upper 
branch P > P * corresponds to transitions from b* = 1/2 
(ergodic dynamics) to b* < 1/2 (nonergodic dynamics), 
whereas the lower branch P < P * shows b* < 1/2 for 
both phases. 

The topological properties of each phase can be seen 
in detail in Fig. [6j where are shown the average in- 
degrees {/^}, block sizes {wi} and the total fraction 
of inputs which lead to the segregated core, E c = 
J2j w c ^jWjkj/(k), where c is the core block. The core 
block emerges at /3 = /3*, with an infinitesimal size, but 
with a value of k{ which is usually significantly above av- 
erage. For P > P* the segregated core usually has a sig- 
nificantly larger average in-degree than for P < P*. The 
dominance and segregation of the core block increases 
continuously with /3, reaching values close to E c = 1, for 
larger values of /?, especially for values of P > P*. 

Each network on the evolved ensemble has a critical 
value of noise P* (different from the value of P for which 
it was evolved), for which its dynamics undergoes the 
aforementioned ergodicity transition and which repre- 
sents the maximum tolerable noise (see [28] for an exact 




FIG. 4. (Color online) The order parameter as a function 
of the selective pressure /3 and noise P, for different values of 




FIG. 5. (Color online) Left: Value of the steady-state average 
error b* as a function of the selective pressure /3 and noise P, 
for (k) = 5. Right: Maximum tolerable noise P*, as a function 
of the selective pressure f3 and noise P, for (k) = 5. 



calculation of P* for arbitrary blockmodels). Interest- 
ingly, the evolution of b* does not automatically result 
in larger values of P*, as is shown in Fig. [5] Some en- 
sembles evolved under larger selective pressure possess a 
lower value of P* than others evolved under lower selec- 
tive pressure (for the same value of P under evolution). 
This means the evolution is reasonably specialized for the 
level of noise it is under, and the behavior of the networks 
under larger values of noise for which they were evolved 
is not automatically better than that of other networks 
with smaller fitness. However, despite these deviations, 
the general tendency is that, for larger values of /?, 6* 
and P* are decreased and increased, respectively. 
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FIG. 6. (Color online) Block average in-degrees {fe}, block 
sizes {wi}, and total fraction E c of inputs originating from the 
core block, as functions of selective pressure /3, for (k) = 5. 
The left panels show curves for P < P* , where P* is the 
critical value of noise for a fully random network, and the 
right panels show curves for P > P* . All curves on the left 
panels are shown in order of increasing P from right to left, 
and on the right panel, from left to right. 



VII. MONTE-CARLO SIMULATIONS 

We have also performed Monte Carlo simulations to 
observe the phase-transition obtained in the previous sec- 
tion. We employed the Metropolis-Hastings [501 EI] algo- 
rithm, starting from a random network with TV vertices, 
with a given average in-degree (k) and a partition into 
n blocks, represented by assigning block labels to each 
vertex (which is initially randomly chosen). At each iter- 
ation, one of the following moves is attempted with equal 
probability: 

1. Block label move: A random vertex v is chosen, 
and its block label is randomly chosen among all n 
possible values. 

2. Input move: A vertex v is chosen with probabil- 
ity p oc k(k — 1), where k is the in-degree of v. 
Another vertex u is randomly chosen with uniform 
probability. Two random inputs from v are deleted 
and moved to u. 

3. Source move: A random vertex v is chosen. A ran- 
dom input from v is deleted and replaced by a ran- 
domly chosen one. 
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FIG. 7. (Color online) Average order parameter (0) and 
steady-state error level b* as functions of the selective pres- 
sure /3, for different values of noise P and (k) = 5, obtained 
with Monte Carlo simulations, for network sizes shown in the 
legend. On the right plot, the red star symbols (★) correspond 
to empirical values of b* as obtained with Eq. [I] The solid 
gray lines are theoretical values obtained by minimizing the 
free energy. 
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FIG. 8. (Color online) Block average in-degrees {ki} and sizes 
{wi}, as functions of selective pressure f3, for (k) = 5, obtained 
with Monte-Carlo simulations, for network sizes shown in the 
legend. The solid gray lines are theoretical values obtained 
by minimizing the free energy. 



A move is rejected if it generates parallel edges or self- 
loops. The difference A6* of the value of 6* after and 
before the move is computed. The move is then accepted 
with a probability p a given by 



Pa 



1 



-f3NAb* 



if A6* < 0, 
otherwise. 



(32) 



The probability p oc k(k — 1) in move (2| is chosen to cor- 
respond to two independent single-edge moves affecting 
the same vertices v and u, where in each move a random 
edge is chosen, and its target is moved to a randomly cho- 
sen vertex. This guarantees that there is no topological 
bias, and that the in-degrees are always odd. 

The value of b* is computed by obtaining the values 

Of {Wi}, {Wj- 
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This is 



i] and {ki}, and iterating Eq. 
much faster than actually measuring the error level on 
the network, and produces deterministic values 1 52J. 

Since we have employed the block label move pi), which 
tends to partition the network evenly into n blocks of 
equal sizes, we have included an entropic cost associated 
with the size of a block, which did not exist in the original 
blockmodel above. In the original model, the partitions 
themselves are not relevant, and only the resulting graph 
topology contributes to the entropy. However, move (TJ 
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FIG. 9. (Color online) Left: Block average in-degrees {fc}, as 
a function of selective pressure /3, for (k) = 5, and obtained 
with Monte- Carlo simulations, for N = 10000 and different 
number of blocks. The solid gray lines are theoretical values 
obtained by minimizing the free energy. Right: Evolution in 
time of the average in-degrees {ki} in a Monte-Carlo simula- 



tion with n = 20 blocks, N = 10000 and f3 
the eventual merging into only two blocks. 



10 , showing 



makes the algorithm very efficient and easy to implement, 
and it should not fundamentally change the results. But 
in order to compare with the theory, we need to include 
the following correction in the number of possible net- 
works: 



which leads to the slightly modified entropy, 
S'/N = S/N - ^2 w i ln w i ■ 



(33) 



(34) 



In Fig. [7] we can see the same phase transition observed 
previously, which matches very well the theoretical pre- 
dictions. In Fig. [8] the topology can be assessed more 
closely, and the emergence of the segregated core is clear. 
Due to the partition entropy introduced in Eq. |34j the 
core does not vanish at the transition; it merges continu- 
ously with the other block instead. However, the critical 
value (3* is identical with the non-modified model. 

The inclusion of the partition entropy also introduces 
the fact that different solutions are obtained for a dif- 
ferent number of blocks, since this has a direct effect on 
the preferred sizes of the blocks (see Fig. [9j left). How- 
ever, this does not change the fact that for any number of 
blocks the preferred topology will always be an effective 
two-block structure. This follows from the argumenta- 
tion presented previously based on the reduction of en- 
tropy resulting from block splits, and can be observed in 
simulations with many blocks, as shown in Fig. [8| which 
shows a comparison between the topologies obtained with 
two and three blocks, as well as the outcome of a typi- 
cal simulation with 20 blocks, which shows the eventual 
collapse of into an effective two-block structure. 



VIII. GENE REGULATORY NETWORKS 

Here we make a comparison with some features ob- 
served in actual gene regulatory networks. We consider 




FIG. 10. (Color online) Gene regulatory networks for Sac- 
charomyces cerevisiae (left) and Escherichia coli (right), ex- 
tracted from the YEASTRACT [53] and RegulonDB [54] 
databases, respectively. The nodes in purple (towards the 
middle) are transcription factor genes, and are the only ones 
with out-degree larger than zero. 



the networks for Saccharomyces cerevisiae (yeast) and 
Escherichia coli, extracted from the YEASTRACT [53] 
and RegulonDB [54J databases, respectively. We are in- 
terested in extracting the "functional core" of the net- 
work, i.e. those nodes which are solely responsible for 
global regulation, like those belonging to the segregated 
core which emerges in the phase transition observed in 
the evolutionary process above. We will characterize the 
core nodes in two ways: 1. Nodes which have an out- 
degree larger than zero; 2. Nodes which belong to a 
strongly connected component (SCC) of the graph (i.e. 
the maximal set of nodes which can directly or indirectly 
regulate each other). The first criterion is a necessary 
condition, since if the out-degree is zero, then a node is 
not a regulator. The second criterion is stronger, since 
even if a node is a regulator, it can have its dynamics com- 
pletely enslaved by other nodes. The nodes in the SCC 
are exactly those which are not necessarily enslaved, and 
can mutually regulate each other. Without a least one 
SCC in the network, an autonomous behavior with dy- 
namical attractors other than simple fixed-points is not 
possible. 

The yeast network is composed of N = 6402 nodes, 
with an average in-degree of (k) ~ 7.51. The E. coli 
network is smaller and sparser, with N = 1658 and 
(k) ~ 2.32. In both networks the number of transcription 
factor (TF) genes is much smaller than the total number: 
A TF = 182 in yeast, and A TF = 154 in E. coli. These 
are core genes according to the first criterion, since they 
have an out-degree larger than zero, as can be seen in 
Fig. [To] 

In yeast, the average in-degree of the core nodes is 
higher than average, (k) c « 10.03, as observed in the 
segregated core phase of the evolved networks obtained. 
For the SCC, the number of nodes decreases slightly to 
^scc = 146, and the average in-degree changes negli- 
gibly (k) cc « 10.48 (if one counts only edges between 
vertices of the SCC, this value is virtually identical, 
(k) cc « 10.42). This is similar to what was found pre- 
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viously in (TB] for the yeast network (using an older, and 
less complete dataset with only 837 genes). They have 
also found that the TF genes have different connection 
patterns, and those with the largest out-degree tend to 
regulate genes with lower than average in-degree. How- 
ever they did not find that the TF genes form a denser 
subgraph, with an larger than average in-degree, which 
is most likely due to the incompleteness of the dataset 
used. Very similar numbers to those presented here were 
obtained more recently in [55], using a more complete 
dataset (which is not identical to the one used in this 
work) . 

For E. coli the situation changes somewhat: The aver- 
age in-degree of the transcription factor nodes is (k) c ~ 
1.97, which is in fact lower than the global average. How- 
ever, if one extracts the largest SCC, the number of nodes 
drops dramatically to Nscc — 8. These nodes are re- 
sponsible for the regulation of 411 genes. A majority of 
1093 genes are instead enslaved to the dynamics of SCC 
with only two mutually regulating nodes. Although the 
largest SCC does have an average in-degree (k) cc = 6, 
the core topology seems significantly more sparse than 
for yeast and the evolved networks, at least with the 
data currently available [SB]. Arguably, such a sparse 
regulating core is suspect from the point of view of data 
set completeness, since it would mean that the range of 
dynamical behavior for the regulatory network is very 
restricted. As previously mentioned, and older and less 
complete data set for yeast also did not reveal a denser 
regulating core [16J. Nevertheless, one should also con- 
sider that such real networks are simultaneously under 
different, possibly competing selective pressures which 
also influence the resulting topology, robustness against 
noise being only one of them. These other factors, which 
are neglected in the model, could be one reason for such 
a discrepancy. We emphasize, however, that although 
apparently it is not denser, a regulating core certainly 
exists in the measured E. coli network, which is at least 
in partial qualitative agreement with what is observed in 
the model. 

There are other factors that may contribute to this ob- 
served segregation which are not in principle related to 
noise resilience. For instance, non-regulating genes exist 
mostly to transcribe proteins which have some specific 
metabolic or structural function in the cell, and it may 
be difficult for these proteins to have a dual role as tran- 
scription factors, and therefore become specialized (al- 
though non-specialized proteins are not impossible, since 
a protein can in principle bind both to DNA and to other 
proteins). Nevertheless, there are good reasons to con- 
sider robustness to noise as a very plausible driving force 
toward this type of topology. This is corroborated, for 
instance, by evidence that core TF genes tend to be less 
noisy [57| l58]. and that the vast majority of TF genes in 
yeast are not vital for the survival of the cell if repressed 
in isolation [59J. This is fully compatible with the idea 
of a highly redundant functional core, which provides ro- 
bustness for the rest of the network. 



Another feature which is commonly investigated in em- 
pirical networks is the in- and out-degree distributions. 
The in-degree distribution is often narrow, while the out- 
degree distribution is broader, and as some suggest [16J, 
compatible with a power law. The model considered in 
this work is parametrized as a stochastic blockmodel, 
where each block has in- and out-degrees that are Pois- 
son distributed. When the segregated core emerges, the 
system is composed of only two blocks, thus both the in- 
and out-degree distributions are bimodal. The in-degree 
distribution is indeed narrower, since the difference be- 
tween the average in-degree of the two blocks is not very 
large for most networks obtained. The out-degree distri- 
bution is also much broader, since the average out-degree 
of the non-core block tends to zero, while for the core 
block it tends very rapidly to infinity, when the selective 
pressure is increased. However, the homogeneous and 
seemingly scale-free properties of the empirical distribu- 
tions are not reproduced by the model. This implies that 
these features are not a direct result of evolved robust- 
ness against noise, and may, for instance, be due simply 
to mutational bias caused by gene duplication, which is 
known to qualitatively reproduce these types of in- and 
out-degree distributions 155], 



IX. CONCLUSION 

We have investigated the effect of selective pressure 
favoring robustness against noise on the structural evo- 
lution of Boolean networks with optimal majority func- 
tions, functioning as a conceptual model for gene regula- 
tion. We have mapped the evolutionary process onto a 
Gibbs ensemble, and obtained its outcome by minimizing 
the associated free energy. We showed that the structural 
properties of the system undergo a phase transition at a 
critical value of selective pressure, from a random topol- 
ogy to a segregated-core structure, where a smaller frac- 
tion of the nodes form an isolated core, which is denser 
than the rest of the network and is responsible for most 
of the regulation. Since the core is denser, its nodes can 
profit from more regulatory redundancy, which greatly 
diminishes the effect of noise. This robustness is propa- 
gated to the rest of the network, which relies on the core 
for most of the regulation. The segregated core becomes 
denser, smaller, and more isolated as the selective pres- 
sure increases. We have compared the theoretical predic- 
tions with Monte- Carlo simulations of actual networks, 
and found perfect agreement. 

We have also shown that this segregated-core struc- 
ture is present in the gene regulatory network of yeast 
and E. coli. Both networks are composed of a much 
smaller fraction of transcription factor genes which are 
responsible for all regulation. In yeast, the existing core 
structure is very similar qualitatively to the outcome of 
the evolutionary process considered, with transcription 
factor genes forming a denser subgraph, with an average 
in-degree above the average for the whole network. In E. 
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coli the isolated transcription factor core is composed of 
few very small regulating cores (strongly connected com- 
ponents), the largest of which has only eight nodes. We 
conjecture that such a sparse regulating core is possibly 
due to data set incompleteness, since it would severely 
restrict the range of possible dynamical behavior for the 
network. A less complete data set for yeast also did not 
show a denser regulating core structure |16| . although it 
is clearly seen with more up-to-date datasets including 
more genes and interactions [55J. However, one should 
not rule out other selection criteria which are not incor- 
porated in the model. 

It should also be noted that regulating cores of tran- 
scription factors are a common feature of other organ- 
isms, such as Mycobacterium tuberculosis [60J. Addition- 
ally, a similar (but not identical) "bow-tie" structure was 
also observed in the mammalian signal transduction net- 
work [61, 62J, where most pathways are funneled through 
a central core. 

It is possible to formulate other reasons for the ex- 



istence of such a core structure, such as a forced spe- 
cialization of genes into either transcription factors or 
target genes. Furthermore one should mention that the 
effects of noise are not always detrimental, and can in 
some circumstances even be beneficial [TT| [63] . Never- 
theless there is compelling evidence that the core genes 
provide a degree of robustness to the cell. Not only are 
the best-connected TF nodes less noisy [57, 58J, they are 
usually found — if removed individually — not to be vi- 
tal for cell survival [59J. This corroborates the idea that 
one of the major functions of the regulating core is to 
provide robustness via redundancy. 

Furthermore, aside from the direct applicability to 
gene regulation, we have identified a fundamental mech- 
anism of robustness against noise, which emerges natu- 
rally when networks are randomly selected for that pur- 
pose [64J. Although most interesting systems require 
more than just robustness for their functioning, it is 
reasonable to conclude that the emergence of regulating 
cores is to be expected when there is enough selective 
pressure favoring noise resilience. 
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